PDF Malware Prediction

Group 21:

Import DataFrame and Data Exploration

Data Cleaning and Preprocessing

More Data Exploration

Data Visualization

Correlation Analysis

Removing highly correlated features

Preparing data for model fitting

Scaling data
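
A minimal sketch of standardization (zero mean, unit variance), assuming that is the kind of scaling meant here; the notebook's actual scaler is not shown. The helper names `fit_scaler` and `transform` and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Return (mean, std) for one feature column, using the population std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        sigma = 1.0  # constant column: avoid division by zero
    return mu, sigma


def transform(values, mu, sigma):
    """Standardize values with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


train_column = [2.0, 4.0, 6.0, 8.0]  # hypothetical training values
test_column = [5.0, 10.0]            # hypothetical held-out values

mu, sigma = fit_scaler(train_column)
train_scaled = transform(train_column, mu, sigma)
test_scaled = transform(test_column, mu, sigma)
```

Separating fitting from transforming makes it possible to compute the statistics on the training rows only and reuse them on held-out rows, which keeps test-set information out of the scaler.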

Train Test Split
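
The sketch below illustrates a stratified split, which keeps the benign and malicious proportions the same in both partitions. It is an assumption that stratification is wanted here; the 20% test fraction, the seed, the helper name `stratified_split`, and the toy labels are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_indices, test_indices) preserving class proportions."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical labels: 0 = benign, 1 = malicious.
labels = [0] * 10 + [1] * 5
train_idx, test_idx = stratified_split(labels)
```

Fixing the seed makes the split reproducible, so that the models compared later are evaluated on the same held-out rows.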

Comparing Multiple Models

Logistic Regression

K-Nearest Neighbors

Support Vector Classifier

Base Model Performance Evaluation

Oversampling
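
The oversampling method used in the notebook is not stated, so the following is a sketch of the simplest variant, random oversampling, in which minority-class rows are duplicated at random until every class matches the size of the largest one. The helper name `random_oversample` and the toy rows are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        # Candidate rows to duplicate for this class.
        pool = [row for row, label in zip(rows, labels) if label == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical imbalanced training data: four rows of class 0, one of class 1.
rows = [[1.0], [2.0], [3.0], [4.0], [9.0]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

Oversampling is normally applied to the training split only; duplicating rows before splitting would place copies of the same sample in both the training and test sets and inflate the measured performance.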

Feature Selection

SVM with Feature Selection

Grid Search CV

SVM with Optimal Hyperparameters

Final Model Performance Evaluation